Optimization and implementation of parallel FP-Growth algorithm based on Spark
GU Junhua, WU Junyan, XU Xinyun, XIE Zhijian, ZHANG Suqi
Journal of Computer Applications    2018, 38 (11): 3069-3074.   DOI: 10.11772/j.issn.1001-9081.2018041219
Abstract
To further improve the execution efficiency of the Frequent Pattern-Growth (FP-Growth) algorithm on the Spark platform, a new Spark-based parallel FP-Growth algorithm named BFPG (Better Frequent Pattern-Growth) is presented. First, the F-List grouping strategy is improved to take into account both the size of the Frequent Pattern-Tree (FP-Tree) and the amount of computation in each partition, so that the total load of each partition is approximately equal. Second, the data set partitioning strategy is optimized by building a list, the P-List, which lowers the time complexity by reducing the number of traversals. The experimental results show that BFPG improves the mining efficiency of the parallel FP-Growth algorithm and that the algorithm has good scalability.
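The two ideas in the abstract can be illustrated with a minimal sketch. It is not the authors' BFPG implementation: the abstract does not give the load estimate or the layout of the P-List, so the sketch assumes that a per-item load estimate is supplied by the caller, balances groups with a generic greedy heuristic (heaviest item to the currently lightest group), and stands in for the P-List with a plain item-to-group dictionary. The function names `balanced_groups` and `group_dependent_transactions` are hypothetical, and plain Python is used instead of Spark to keep the example self-contained.

```python
import heapq
from typing import Dict, List


def balanced_groups(loads: Dict[str, float], num_groups: int) -> Dict[str, int]:
    """Assign each frequent item to a group so that group loads stay close.

    `loads` maps an item to an estimated mining cost (an assumption of this
    sketch; the paper's actual estimate is not given in the abstract).
    Greedy rule: take items from heaviest to lightest and place each one
    in the group with the smallest accumulated load.
    """
    heap = [(0.0, g) for g in range(num_groups)]  # (accumulated load, group id)
    heapq.heapify(heap)
    item_to_group: Dict[str, int] = {}
    # Ties are broken by item name so the result is deterministic.
    for item, load in sorted(loads.items(), key=lambda kv: (-kv[1], kv[0])):
        total, group = heapq.heappop(heap)
        item_to_group[item] = group
        heapq.heappush(heap, (total + load, group))
    return item_to_group


def group_dependent_transactions(
    transaction: List[str], item_to_group: Dict[str, int]
) -> Dict[int, List[str]]:
    """Split one transaction among groups in a single right-to-left pass.

    `transaction` is assumed to be filtered to frequent items and sorted in
    F-List order. The dictionary lookup (a stand-in for the P-List) finds the
    group of an item in constant time, so the transaction is scanned once.
    Each group receives the prefix ending at its right-most item.
    """
    result: Dict[int, List[str]] = {}
    for i in range(len(transaction) - 1, -1, -1):
        group = item_to_group.get(transaction[i])
        if group is not None and group not in result:
            result[group] = transaction[: i + 1]
    return result


if __name__ == "__main__":
    mapping = balanced_groups({"a": 5, "b": 4, "c": 3, "d": 3, "e": 1}, 2)
    print(mapping)
    print(group_dependent_transactions(["a", "b", "d", "e"], mapping))
```

In a Spark job, the mapping would typically be broadcast to the executors and the second function applied to each transaction inside a `flatMap`; that wiring is omitted here.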